One of the main problems in applying deep learning techniques to recognize activities of daily living (ADLs) based on inertial sensors is the lack of appropriately large labelled datasets to train deep learning-based models. A large amount of data would be available due to the wide spread of mobile devices equipped with inertial sensors that can collect data to recognize human activities. Unfortunately, this data is not labelled. The paper proposes DISC (Deep Inertial Sensory Clustering), a DL-based clustering architecture that automatically labels multi-dimensional inertial signals. In particular, the architecture combines a recurrent AutoEncoder and a clustering criterion to predict unlabelled human activities-related signals. The proposed architecture is evaluated on three publicly available HAR datasets and compared with four well-known end-to-end deep clustering approaches. The experiments demonstrate the effectiveness of DISC on both clustering accuracy and normalized mutual information metrics.
translated by 谷歌翻译
长期或慢性病的人管理是国家卫生系统面临的最大挑战之一。实际上,这些疾病是住院的主要原因之一,尤其是对于老年人,监测它们所需的大量资源导致医疗保健系统可持续性问题。便携式设备和新连接技术的扩散越来越大,可以实施能够为医疗保健提供者提供支持并减轻医院和诊所的负担。在本文中,我们介绍了用于医疗保健的远程监控平台的实现,该平台旨在从不同的消费者移动设备和自定义设备中捕获几种类型的生理健康参数。可以通过Google Fit生态系统将消​​费者医疗设备集成到平台中,该生态系统支持数百个设备,而自定义设备可以通过标准通信协议直接与平台进行交互。该平台旨在使用机器学习算法处理获得的数据,并为患者和医生提供生理健康参数,并提供用户友好,全面且易于理解的仪表板,该仪表板通过时间来监视参数。初步可用性测试在功能和实用性方面表现出良好的用户满意度。
translated by 谷歌翻译
在过去的十年中,通过深度学习方法取得了杰出的结果,对单一语言的语音情感识别(SER)取得了显着的结果。但是,由于(i)源和目标域分布之间的巨大差异,(ii)少数标记和许多未标记的新语言的话语,跨语言SER仍然是现实世界中的挑战。考虑到以前的方面,我们提出了一种半监督学习方法(SSL)方法,用于跨语性情感识别时,当有一些新语言的标签可用时。基于卷积神经网络(CNN),我们的方法通过利用伪标记的策略来适应新语言。特别是,研究了使用硬和软伪标签方法的使用。我们在源和新语言上均独立于语言的设置中彻底评估了该方法的性能,并在属于不同语言菌株的五种语言中显示出其稳健性。
translated by 谷歌翻译
在本文中,我们提出了一种无监督的方法,用于高光谱遥感图像分割。该方法利用了平均移位聚类算法,该算法将作为输入的初步高光谱超像素分割以及光谱像素信息。所提出的方法不需要分割类的数量作为输入参数,也不需要利用有关要分割的土地覆盖或土地使用类型的A-Priori知识(例如水,植被,建筑等)。进行了Salinas,Salinasa,Pavia Center和Pavia University数据集的实验。绩效是根据归一化信息,调整后的RAND指数和F1得分来衡量的。结果证明了该方法与艺术状态相比的有效性。
translated by 谷歌翻译
图像的美学质量被定义为图像美的度量或欣赏。美学本质上是一个主观性的财产,但是存在一些影响它的因素,例如图像的语义含量,描述艺术方面的属性,用于射击的摄影设置等。在本文中,我们提出了一种方法基于语义含量分析,艺术风格和图像的组成的图像自动预测图像的美学。所提出的网络包括:用于语义特征的预先训练的网络,提取(骨干网);依赖于骨干功能的多层的Perceptron(MLP)网络,用于预测图像属性(attributeNet);一种自适应的HyperNetwork,可利用以前编码到attributeNet生成的嵌入的属性以预测专用于美学估计的目标网络的参数(AestheticNet)。鉴于图像,所提出的多网络能够预测:风格和组成属性,以及美学分数分布。结果三个基准数据集展示了所提出的方法的有效性,而消融研究则更好地了解所提出的网络。
translated by 谷歌翻译
We are witnessing a widespread adoption of artificial intelligence in healthcare. However, most of the advancements in deep learning (DL) in this area consider only unimodal data, neglecting other modalities. Their multimodal interpretation necessary for supporting diagnosis, prognosis and treatment decisions. In this work we present a deep architecture, explainable by design, which jointly learns modality reconstructions and sample classifications using tabular and imaging data. The explanation of the decision taken is computed by applying a latent shift that, simulates a counterfactual prediction revealing the features of each modality that contribute the most to the decision and a quantitative score indicating the modality importance. We validate our approach in the context of COVID-19 pandemic using the AIforCOVID dataset, which contains multimodal data for the early identification of patients at risk of severe outcome. The results show that the proposed method provides meaningful explanations without degrading the classification performance.
translated by 谷歌翻译
Human Activity Recognition (HAR) is one of the core research areas in mobile and wearable computing. With the application of deep learning (DL) techniques such as CNN, recognizing periodic or static activities (e.g, walking, lying, cycling, etc.) has become a well studied problem. What remains a major challenge though is the sporadic activity recognition (SAR) problem, where activities of interest tend to be non periodic, and occur less frequently when compared with the often large amount of irrelevant background activities. Recent works suggested that sequential DL models (such as LSTMs) have great potential for modeling nonperiodic behaviours, and in this paper we studied some LSTM training strategies for SAR. Specifically, we proposed two simple yet effective LSTM variants, namely delay model and inverse model, for two SAR scenarios (with and without time critical requirement). For time critical SAR, the delay model can effectively exploit predefined delay intervals (within tolerance) in form of contextual information for improved performance. For regular SAR task, the second proposed, inverse model can learn patterns from the time series in an inverse manner, which can be complementary to the forward model (i.e.,LSTM), and combining both can boost the performance. These two LSTM variants are very practical, and they can be deemed as training strategies without alteration of the LSTM fundamentals. We also studied some additional LSTM training strategies, which can further improve the accuracy. We evaluated our models on two SAR and one non-SAR datasets, and the promising results demonstrated the effectiveness of our approaches in HAR applications.
translated by 谷歌翻译
Testing Deep Learning (DL) based systems inherently requires large and representative test sets to evaluate whether DL systems generalise beyond their training datasets. Diverse Test Input Generators (TIGs) have been proposed to produce artificial inputs that expose issues of the DL systems by triggering misbehaviours. Unfortunately, such generated inputs may be invalid, i.e., not recognisable as part of the input domain, thus providing an unreliable quality assessment. Automated validators can ease the burden of manually checking the validity of inputs for human testers, although input validity is a concept difficult to formalise and, thus, automate. In this paper, we investigate to what extent TIGs can generate valid inputs, according to both automated and human validators. We conduct a large empirical study, involving 2 different automated validators, 220 human assessors, 5 different TIGs and 3 classification tasks. Our results show that 84% artificially generated inputs are valid, according to automated validators, but their expected label is not always preserved. Automated validators reach a good consensus with humans (78% accuracy), but still have limitations when dealing with feature-rich datasets.
translated by 谷歌翻译
Deep Neural Networks (DNN) are increasingly used as components of larger software systems that need to process complex data, such as images, written texts, audio/video signals. DNN predictions cannot be assumed to be always correct for several reasons, among which the huge input space that is dealt with, the ambiguity of some inputs data, as well as the intrinsic properties of learning algorithms, which can provide only statistical warranties. Hence, developers have to cope with some residual error probability. An architectural pattern commonly adopted to manage failure-prone components is the supervisor, an additional component that can estimate the reliability of the predictions made by untrusted (e.g., DNN) components and can activate an automated healing procedure when these are likely to fail, ensuring that the Deep Learning based System (DLS) does not cause damages, despite its main functionality being suspended. In this paper, we consider DLS that implement a supervisor by means of uncertainty estimation. After overviewing the main approaches to uncertainty estimation and discussing their pros and cons, we motivate the need for a specific empirical assessment method that can deal with the experimental setting in which supervisors are used, where the accuracy of the DNN matters only as long as the supervisor lets the DLS continue to operate. Then we present a large empirical study conducted to compare the alternative approaches to uncertainty estimation. We distilled a set of guidelines for developers that are useful to incorporate a supervisor based on uncertainty monitoring into a DLS.
translated by 谷歌翻译
Emerging applications such as Deep Learning are often data-driven, thus traditional approaches based on auto-tuners are not performance effective across the wide range of inputs used in practice. In the present paper, we start an investigation of predictive models based on machine learning techniques in order to optimize Convolution Neural Networks (CNNs). As a use-case, we focus on the ARM Compute Library which provides three different implementations of the convolution operator at different numeric precision. Starting from a collation of benchmarks, we build and validate models learned by Decision Tree and naive Bayesian classifier. Preliminary experiments on Midgard-based ARM Mali GPU show that our predictive model outperforms all the convolution operators manually selected by the library.
translated by 谷歌翻译